Hybrid Deep Learning Model for Sarcasm Detection in Indian Indigenous Language Using Word-Emoji Embeddings

Abstract

Automated sarcasm detection is deemed a complex natural language processing task, and extending it to Hindi, a morphologically-rich and free-order dominant indigenous Indian language, is another challenge in itself. The scarcity of resources and tools such as annotated corpora, lexicons, dependency parsers, Part-of-Speech taggers, and benchmark datasets aggravates the linguistic challenges of sarcasm detection in low-resource languages like Hindi. Furthermore, as context incongruity is imperative to detect sarcasm, various linguistic, aural, and visual cues can be used to predict a target utterance as sarcastic. While pre-trained word embeddings capture the meanings, semantic relationships, and different types of contexts in the form of word representations, emojis can also render useful contextual information, analogous to human facial expressions, for gauging sarcasm. Thus, the goal of this research is to demonstrate the use of a hybrid deep learning model trained using two embeddings, namely word and emoji embeddings, to detect sarcasm. The model is validated on a Hindi tweets dataset, Sarc-H, manually annotated with sarcastic and non-sarcastic labels. The preliminary results clearly depict the importance of using emojis for sarcasm detection, with our model attaining an accuracy of 97.35% and an F-score of 0.9708. The research validates that automated feature engineering facilitates an efficient and repeatable predictive model for detecting sarcasm in indigenous, low-resource languages.
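
The idea of feeding a classifier with both word and emoji embeddings can be illustrated with a minimal sketch. It is not the paper's model: the embedding tables `WORD_VECS` and `EMOJI_VECS`, their two-dimensional values, the emoji code-point ranges, and the mean-pooling and concatenation steps are all assumptions made for illustration. A real system would load pre-trained vectors and pass the fused features to a deep network.

```python
from typing import Dict, List, Tuple

# Hypothetical toy embedding tables (values invented for illustration only).
WORD_VECS: Dict[str, List[float]] = {
    "वाह": [0.9, 0.1],
    "great": [0.8, 0.2],
    "traffic": [-0.5, 0.4],
}
EMOJI_VECS: Dict[str, List[float]] = {
    "🙄": [-0.7, 0.9],
    "😊": [0.6, -0.2],
}


def is_emoji(ch: str) -> bool:
    """Rough check using two common emoji code-point ranges (an assumption)."""
    cp = ord(ch)
    return 0x1F300 <= cp <= 0x1FAFF or 0x2600 <= cp <= 0x27BF


def split_tokens(text: str) -> Tuple[List[str], List[str]]:
    """Separate a tweet into whitespace-delimited words and emoji characters."""
    emojis: List[str] = []
    buf: List[str] = []
    for ch in text:
        if is_emoji(ch):
            emojis.append(ch)
            buf.append(" ")  # an emoji also acts as a word boundary
        else:
            buf.append(ch)
    words = "".join(buf).lower().split()
    return words, emojis


def mean_vector(tokens: List[str], table: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    hits = [table[t] for t in tokens if t in table]
    if not hits:
        return [0.0] * dim
    return [sum(col) / len(hits) for col in zip(*hits)]


def fuse(text: str, dim: int = 2) -> List[float]:
    """Concatenate the mean word vector and the mean emoji vector."""
    words, emojis = split_tokens(text)
    return mean_vector(words, WORD_VECS, dim) + mean_vector(emojis, EMOJI_VECS, dim)


if __name__ == "__main__":
    print(fuse("वाह great traffic 🙄"))
```

Keeping the two channels separate until the concatenation step means the emoji signal is not diluted by the much larger word vocabulary, which is one plausible reading of why a two-embedding design helps.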

Similar resources

Tweet Sarcasm Detection Using Deep Neural Network

Sarcasm detection has been modeled as a binary document classification task, with rich features being defined manually over input documents. Traditional models employ discrete manual features to address the task, with much research effort being devoted to the design of effective feature templates. We investigate the use of a neural network for tweet sarcasm detection, and compare the effects of t...

Melanoma detection with a deep learning model

Background: Skin cancer is one of the most common forms of cancer in the world and melanoma is the deadliest type of skin cancer. Both melanoma and melanocytic nevi begin in melanocytes (cells that produce melanin). However, melanocytic nevi are benign whereas melanoma is malignant. This work proposes a deep learning model for classification of these two lesions. Methods: In this analytic s...

Clickbait detection using word embeddings

Clickbait is a pejorative term describing web content that is aimed at generating online advertising revenue, especially at the expense of quality or accuracy, relying on sensationalist headlines or eye-catching thumbnail pictures to attract click-throughs and to encourage forwarding of the material over online social networks. We use distributed word representations of the words in the title as...

Is deep learning really necessary for word embeddings?

Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architectures can be difficult and time-consuming to train. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurrence matrix. We compare those new word embeddings with some well-known embeddings ...

Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...

Journal

Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing

Year: 2023

ISSN: 2375-4699, 2375-4702

DOI: https://doi.org/10.1145/3519299